Statistical Machine Translation for Query Expansion in Answer Retrieval

Authors

  • Stefan Riezler
  • Alexander Vasserman
  • Ioannis Tsochantaridis
  • Vibhu O. Mittal
  • Yi Liu

Abstract

We present an approach to query expansion in answer retrieval that uses Statistical Machine Translation (SMT) techniques to bridge the lexical gap between questions and answers. SMT-based query expansion is done by (i) using a full-sentence paraphraser to introduce synonyms in the context of the entire query, and (ii) translating query terms into answer terms using a full-sentence SMT model trained on question-answer pairs. We evaluate these global, context-aware query expansion techniques on tf-idf retrieval from 10 million question-answer pairs extracted from FAQ pages. Experimental results show that SMT-based expansion improves retrieval performance over local expansion and over retrieval without expansion.
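The idea of translating query terms into answer terms before tf-idf retrieval can be illustrated with a minimal sketch. This is not the authors' system: a hand-written word-level translation table stands in for the full-sentence SMT model, and all names (`TRANSLATION_TABLE`, `expand_query`, `rank`), probabilities, and example texts are hypothetical.

```python
import math
from collections import Counter

# Hypothetical P(answer term | question term). A real table would be
# estimated from question-answer pairs; these numbers are invented.
TRANSLATION_TABLE = {
    "fix": {"repair": 0.4, "replace": 0.3},
    "leaky": {"leak": 0.5, "drip": 0.2},
    "faucet": {"tap": 0.5, "washer": 0.3, "valve": 0.1},
}

# Toy answer collection, already tokenized.
ANSWERS = [
    "replace the worn washer inside the tap to stop the drip".split(),
    "restart the router and check the cable".split(),
    "water the plant twice a week".split(),
]


def expand_query(query_terms, table, top_k=2):
    """Append the top_k most probable answer terms for each query term."""
    expanded = list(query_terms)
    for term in query_terms:
        candidates = sorted(table.get(term, {}).items(), key=lambda kv: -kv[1])
        for candidate, _prob in candidates[:top_k]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector per document, plus the idf table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf, so that no term gets a zero or negative weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_terms, docs):
    """Return (doc_index, score) pairs sorted by descending tf-idf cosine."""
    doc_vecs, idf = tfidf_vectors(docs)
    # Query terms unseen in the collection get weight 0.
    q_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(query_terms).items()}
    scored = [(i, cosine(q_vec, dv)) for i, dv in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: -pair[1])


query = "fix leaky faucet".split()
baseline = rank(query, ANSWERS)
expanded = rank(expand_query(query, TRANSLATION_TABLE), ANSWERS)
```

In this toy collection the original query shares no term with any answer, so every baseline score is zero, while the expanded query matches the first answer through "replace", "drip", "tap", and "washer". Translating each term independently ignores the sentence context that the abstract emphasizes, which is the main simplification in this sketch.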

Similar resources

An Improvement in Cross-Language Document Retrieval Based on Statistical Models

This paper proposes a method for cross-language document retrieval that integrates three statistical models: a translation model, a query generation model, and a document retrieval model. Given a document in the source language, it is translated into the target language by the statistical machine translation model. The query generation model then selects the most relevant wo...

Combining lexical and statistical translation evidence for cross-language information retrieval

This paper explores how best to use lexical and statistical translation evidence together for Cross-Language Information Retrieval (CLIR). Lexical translation evidence is assembled from Wikipedia and from a large machine-readable dictionary, statistical translation evidence is drawn from parallel corpora, and evidence from co-occurrence in the document language provides a basis for limiting the ...

Cross-Language Retrieval Using HAIRCUT for CLEF 2004

JHU/APL continued to explore the use of knowledge-light methods for scalable multilingual retrieval during the CLEF 2004 evaluation. We relied on the language-neutral techniques of character n-gram tokenization, pre-translation query expansion, statistical translation using aligned parallel corpora, fusion from disparate retrievals, and reliance on language similarity when resources are scarce....

QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches

A major problem in information retrieval is the difficulty of defining the user's information needs; on the other hand, when the user submits a query, there is a vast amount of information to retrieve. Different methods, therefore, have been suggested for query expansion, which is concerned with reconfiguring the query to increase efficiency and improve the criterion accuracy in the information...

DCU's Experiments for the NTCIR-8 IR4QA Task

We describe DCU’s participation in the NTCIR-8 IR4QA task [16]. This task is a cross-language information retrieval (CLIR) task from English to Simplified Chinese which seeks to provide relevant documents for later cross-language question answering (CLQA) tasks. For the IR4QA task, we submitted 5 official runs including two monolingual runs and three CLIR runs. For the monolingual retrieval we ...

Publication date: 2007